Synergy: A HW/SW Framework for High Throughput CNNs on Embedded Heterogeneous SoC

Authors

  • Guanwen Zhong
  • Akshat Dubey
  • Tan Cheng
  • Tulika Mitra
Abstract

Convolutional Neural Networks (CNNs) have been widely deployed in diverse application domains. There has been significant progress in accelerating both their training and inference using high-performance GPUs, FPGAs, and custom ASICs for datacenter-scale environments. The recent proliferation of mobile and IoT devices has necessitated real-time, energy-efficient deep neural network inference on embedded-class, resource-constrained platforms. In this context, we present Synergy, an automated, hardware-software co-designed, pipelined, high-throughput CNN inference framework on embedded heterogeneous system-on-chip (SoC) architectures (Xilinx Zynq). Through multi-threading, Synergy leverages all the available on-chip resources, which include the dual-core ARM processor along with the FPGA and the NEON SIMD engines as accelerators. Moreover, Synergy provides a unified abstraction of the heterogeneous accelerators (FPGA and NEON) and can adapt to different network configurations at runtime without changing the underlying hardware accelerator architecture, by balancing workload across accelerators through work-stealing. Synergy achieves a 7.3X speedup, averaged across seven CNN models, over a well-optimized software-only solution. Synergy demonstrates substantially better throughput and energy efficiency than contemporary CNN implementations on the same SoC architecture.
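
The work-stealing idea mentioned in the abstract can be illustrated with a small, deterministic simulation. This is only a sketch of the general technique, not Synergy's scheduler: the abstract does not describe the actual implementation, and the function name, the queue layout, and the per-job costs below are all hypothetical. Each worker stands in for an accelerator with its own job queue and its own speed. A worker that runs out of local jobs takes one from the tail of the longest remaining queue.

```python
from collections import deque


def run_work_stealing(queues, cost_per_job):
    """Simulate work-stealing between workers (hypothetical toy model).

    queues: one list of job ids per worker (the initial static split).
    cost_per_job: time one job takes on each worker (assumed constant).
    Returns (makespan, jobs_done_per_worker, number_of_steals).
    """
    local = [deque(q) for q in queues]
    clock = [0.0] * len(local)   # time at which each worker is next idle
    done = [0] * len(local)
    steals = 0
    while any(local):            # at least one queue still holds jobs
        # The worker that becomes idle first acts next.
        w = min(range(len(local)), key=lambda i: clock[i])
        if local[w]:
            local[w].popleft()   # take the next job from its own queue
        else:
            # Own queue is empty: steal from the tail of the longest queue.
            victim = max(range(len(local)), key=lambda i: len(local[i]))
            local[victim].pop()
            steals += 1
        clock[w] += cost_per_job[w]
        done[w] += 1
    return max(clock), done, steals


if __name__ == "__main__":
    # Two workers, 8 jobs each; the second worker is 3x slower per job.
    split = [list(range(8)), list(range(8, 16))]
    print(run_work_stealing(split, cost_per_job=[1.0, 3.0]))
```

With this assumed even split, the slower worker alone would need 24 time units for its 8 jobs. In the simulation the faster worker steals 4 of them and both finish at 12 time units, which is the load-balancing effect that work-stealing is meant to provide.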


Similar resources

A comprehensive integration infrastructure for embedded system design

A System-on-a-Chip (SoC) is the most successful example of how the evolution of the chip integration technology allows the manufacture of complex embedded systems. However, the bulk of the design effort, to efficiently combine the HW and SW components in a SoC, still resides in the HW/SW interfacing architecture. A good HW/SW integration strategy has a positive impact either in performance, eff...


Operating System Abstractions of Hardware Accelerators on Field-programmable Gate Arrays

Traditionally, one of the main functions of the Operating System (OS) is to abstract the programming model from the low level details of the specific HW platform resources. However, in an FPGA-based SoC with HW accelerators, even with an OS layer, there is no unified HW/SW framework that provides: 1) transparency to the SW designer at the application level; and 2) an interface and OS support fo...


A Trace-based Workflow for Evaluating Application-specific Memory Bandwidth for FPGA-SoCs

FPGA-SoCs such as Xilinx’s Zynq-7000 and Altera’s Cyclone V SoC provide a great platform for HW/SW-Codesigns. These devices combine a powerful embedded processor with programmable logic similar to that found in FPGAs. For the communication between both parts, FPGA-SoCs provide various interfaces in the logic component that offer access to the DDR memory used by the processor. While high through...


P-Ware: Performance-Aware Transaction-Level Simulation for Network Processor Applications

Platform-based design is an approach to cope with increasing costs in developing complex embedded systems. In order to support performance analysis at system-platform level, this report presents a methodology and tool which provide a joint SW/HW component-based modelling and simulation framework. Our framework allows for specifying variable transaction latencies, and separates functional and ti...


System Level Distributed Cooperative Design of Media SoC Using Application Profiling

Heterogeneous multi-core System-on-Chip architectures can support various embedded real-time applications well. SoC design is very complex, requiring experts from multiple fields to collaborate on application analysis, system decisions, and HW/SW co-design. However, existing SoC design methods and environments can only support human-computer interaction, ignoring the collaboration interaction between multi-fi...



Publication date: 2018